An Empirical Study of the Maximal and Total Information Coefficients and Leading Measures of Dependence

Authors

  • DAVID N. RESHEF
  • YAKIR A. RESHEF
  • PARDIS C. SABETI
  • MICHAEL MITZENMACHER
Abstract

In exploratory data analysis, we are often interested in identifying promising pairwise associations for further analysis while filtering out weaker ones. This can be accomplished by computing a measure of dependence on all variable pairs and examining the highest-scoring pairs, provided the measure of dependence used assigns similar scores to equally noisy relationships of different types. This property, called equitability and previously formalized, can be used to assess measures of dependence along with the power of their corresponding independence tests and their runtime. Here we present an empirical evaluation of the equitability, power against independence, and runtime of several leading measures of dependence. These include the two recently introduced and simultaneously computable statistics MICe, whose goal is equitability, and TICe, whose goal is power against independence. Regarding equitability, our analysis finds that MICe is the most equitable method on functional relationships in most of the settings we considered. Regarding power against independence, we find that TICe and Heller and Gorfine’s SDDP share state-of-the-art performance, with several other methods achieving excellent power as well. Our analyses also show evidence for a trade-off between power against independence and equitability consistent with recent theoretical work. Our results suggest that a fast and useful strategy for achieving a combination of power against independence and equitability is to filter relationships by TICe and then to rank the remaining ones using MICe. We confirm our findings on a set of data collected by the World Health Organization.
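The filter-then-rank strategy described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the function `filter_then_rank`, its parameters, and the fixed `threshold` are hypothetical, and `abs_pearson` is a simple placeholder statistic standing in for both TICe (the filter) and MICe (the ranking score), neither of which is implemented here. In practice the cutoff would more plausibly come from an independence test than from a hand-picked value.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def abs_pearson(x, y):
    """Placeholder statistic (NOT MICe or TICe): absolute Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable carries no linear association
    return abs(sxy) / sqrt(sxx * syy)


def filter_then_rank(columns, filter_stat, rank_stat, threshold):
    """Keep variable pairs whose filter statistic reaches `threshold`,
    then sort the surviving pairs by the ranking statistic, highest first.

    columns     -- dict mapping variable name to a list of numbers
    filter_stat -- callable(x, y) -> float, the role TICe plays in the abstract
    rank_stat   -- callable(x, y) -> float, the role MICe plays in the abstract
    threshold   -- hypothetical cutoff on the filter statistic
    """
    kept = []
    for a, b in combinations(sorted(columns), 2):
        if filter_stat(columns[a], columns[b]) >= threshold:
            kept.append((a, b, rank_stat(columns[a], columns[b])))
    return sorted(kept, key=lambda item: item[2], reverse=True)


# Toy data: y is an exact linear function of x, z is only weakly related.
data = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [3, 1, 4, 1, 5],
}
ranked = filter_then_rank(data, abs_pearson, abs_pearson, threshold=0.9)
```

Passing the two statistics as separate callables mirrors the division of labor the abstract describes: one statistic chosen for power against independence decides which pairs survive, and a second chosen for equitability orders them.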

Similar resources

Supplier Development Activities and Buying Firm’s Performance: An Empirical Investigation of Iranian SMEs

This study attempts to investigate the major antecedent factors that influence manufacturing SMEs' intentions toward the implementation of supplier development activities in Iranian SMEs. In order to achieve this objective, the research constructs were developed. The conceptual framework underlying this study was based on the theories of supplier development activities and social capital. These...

Measures of maximal entropy

We extend the results of Walters on the uniqueness of invariant measures with maximal entropy on compact groups to an arbitrary locally compact group. We show that the maximal entropy is attained at the left Haar measure and the measure of maximal entropy is unique.

Parameter Estimation of Some Archimedean Copulas Based on Minimum Cramér-von-Mises Distance

The purpose of this paper is to introduce a new estimation method for estimating the Archimedean copula dependence parameter in the non-parametric setting. The estimation of the dependence parameter has been selected as the value that minimizes the Cramér-von-Mises distance which measures the distance between Empirical Bernstein Kendall distribution function and true Kendall distribution functi...

Continuous dependence on coefficients for stochastic evolution equations with multiplicative Lévy Noise and monotone nonlinearity

Semilinear stochastic evolution equations with multiplicative Lévy noise are considered. The drift term is assumed to be monotone nonlinear and with linear growth. Unlike other similar works, we do not impose coercivity conditions on coefficients. We establish the continuous dependence of the mild solution with respect to initial conditions and also on coefficients. As corollaries of ...

Dependence of Default Probability and Recovery Rate in Structural Credit Risk Models: Empirical Evidence from Greece

The main idea of this paper is to study the dependence between the probability of default (PD) and the recovery rate (RR) in a credit portfolio and to examine this relationship empirically. We examine the dependence between PD and RR by a theoretical approach. For the empirical methodology, we use the bootstrapped quantile regression and the simultaneous quantile regression. These methods allow to determinate...

Publication date: 2017